
    Constrained Optimization of Rank-One Functions with Indicator Variables

    Optimization problems involving the minimization of a rank-one convex function over constraints modeling restrictions on the support of the decision variables emerge in various machine learning applications. These problems are often modeled with indicator variables that identify the support of the continuous variables. In this paper we investigate compact extended formulations for such problems through perspective reformulation techniques. In contrast to the majority of previous work, which relies on support function arguments and disjunctive programming techniques to provide convex hull results, we propose a constructive approach that exploits a hidden conic structure induced by perspective functions. To this end, we first establish a convex hull result for a general conic mixed-binary set in which each conic constraint involves a linear function of independent continuous variables and a set of binary variables. We then demonstrate that extended representations of sets associated with epigraphs of rank-one convex functions over constraints modeling indicator relations naturally admit such a conic representation. This enables us to systematically derive perspective formulations for the convex hull descriptions of these sets with nonlinear separable or non-separable objective functions, sign constraints on continuous variables, and combinatorial constraints on indicator variables. We illustrate the efficacy of our results on sparse nonnegative logistic regression problems.
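
    A minimal illustration of the perspective technique in the single-variable case (the standard textbook example, not a result specific to this paper): consider the mixed-binary set

        \[
          S = \{ (x, z, t) \in \mathbb{R} \times \{0,1\} \times \mathbb{R} \;:\; t \ge x^2, \; x(1 - z) = 0 \}.
        \]

    Relaxing $z$ to $[0,1]$ and replacing $t \ge x^2$ by its perspective gives the closed convex hull

        \[
          \overline{\operatorname{conv}}(S) = \{ (x, z, t) \in \mathbb{R} \times [0,1] \times \mathbb{R} \;:\; t z \ge x^2, \; t \ge 0 \},
        \]

    where $t z \ge x^2$ with $t, z \ge 0$ is a rotated second-order cone constraint, an instance of the hidden conic structure the paper exploits.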

    Data-driven Inverse Optimization with Imperfect Information

    In data-driven inverse optimization, an observer aims to learn the preferences of an agent who solves a parametric optimization problem depending on an exogenous signal. Thus, the observer seeks the agent's objective function that best explains a historical sequence of signals and corresponding optimal actions. We focus here on situations where the observer has imperfect information, that is, where the agent's true objective function is not contained in the search space of candidate objectives, where the agent suffers from bounded rationality or implementation errors, or where the observed signal-response pairs are corrupted by measurement noise. We formalize this inverse optimization problem as a distributionally robust program minimizing the worst-case risk that the predicted decision (i.e., the decision implied by a particular candidate objective) differs from the agent's actual response to a random signal. We show that our framework offers rigorous out-of-sample guarantees for different loss functions used to measure prediction errors and that the emerging inverse optimization problems can be exactly reformulated as (or safely approximated by) tractable convex programs when a new suboptimality loss function is used. We show through extensive numerical tests that the proposed distributionally robust approach to inverse optimization often attains better out-of-sample performance than the state-of-the-art approaches.
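
    Schematically, and in our notation rather than the paper's, the distributionally robust program has the form

        \[
          \min_{\theta \in \Theta} \; \sup_{\mathbb{Q} \in \mathbb{B}_\epsilon(\widehat{\mathbb{P}}_N)} \;
          \mathbb{E}^{\mathbb{Q}}\!\left[ \ell\big( x^\star(s; \theta),\, x \big) \right],
        \]

    where $(s, x)$ is a random signal-response pair, $\widehat{\mathbb{P}}_N$ is the empirical distribution of the $N$ observations, $\mathbb{B}_\epsilon$ is an ambiguity ball of radius $\epsilon$ around it, $x^\star(s; \theta)$ is the decision predicted by the candidate objective $\theta$, and $\ell$ is a loss function (such as the suboptimality loss) quantifying the prediction error.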

    Wasserstein Distributionally Robust Learning

    Many decision problems in science, engineering, and economics are affected by uncertainty, which is typically modeled by a random variable governed by an unknown probability distribution. For many practical applications, the probability distribution is only observable through a set of training samples. In data-driven decision-making, the goal is to find a decision based on the training samples that will perform well on unseen test samples. In this thesis, we leverage techniques from distributionally robust optimization to address problems in statistical learning, behavioral economics, and estimation. In particular, we study Wasserstein distributionally robust optimization, where the decision-maker learns decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples.

    In the first part of the thesis we study regression and classification methods in supervised learning from the distributionally robust perspective. In the classical setting the goal is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce training data, overfitting is typically mitigated by adding regularization terms to the objective that penalize hypothesis complexity. We introduce new regularization techniques using ideas from distributionally robust optimization, and we give new probabilistic interpretations to existing techniques.

    In the second part of the thesis we consider data-driven inverse optimization problems where an observer aims to learn the preferences of an agent who solves a parametric optimization problem depending on an exogenous signal. Thus, the observer seeks the agent's objective function that best explains a historical sequence of signals and corresponding optimal actions. We focus here on situations where the observer has imperfect information, that is, where the agent's true objective function is not contained in the search space of candidate objectives, where the agent suffers from bounded rationality or implementation errors, or where the observed signal-response pairs are corrupted by measurement noise. We formalize this inverse optimization problem as a distributionally robust program minimizing the worst-case risk that the predicted decision (i.e., the decision implied by a particular candidate objective) differs from the agent's actual response to a random signal.

    In the final part of the thesis we study a distributionally robust mean square error estimation problem over a nonconvex Wasserstein ambiguity set containing only normal distributions. We show that the optimal estimator and the least favorable distribution form a Nash equilibrium. Despite the nonconvex nature of the ambiguity set, we prove that the estimation problem is equivalent to a tractable convex program. We further devise a Frank-Wolfe algorithm for this convex program whose direction-searching subproblem can be solved in a quasi-closed form. Using these ingredients, we introduce a distributionally robust Kalman filter that hedges against model risk.
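
    The Frank-Wolfe scheme mentioned in the final part is projection-free: each iteration solves a linear subproblem over the feasible set and steps toward its solution. The sketch below is our own minimal illustration on the probability simplex, not the thesis's quasi-closed-form direction search.

        import numpy as np

        def frank_wolfe_simplex(grad, x0, n_iter=200):
            """Generic Frank-Wolfe over the probability simplex.
            grad: function returning the gradient of the convex objective at x.
            The linear minimization oracle over the simplex is the vertex e_i
            with i = argmin of the gradient, so every iterate stays feasible."""
            x = x0.copy()
            for t in range(n_iter):
                g = grad(x)
                s = np.zeros_like(x)
                s[np.argmin(g)] = 1.0          # LMO: best vertex of the simplex
                gamma = 2.0 / (t + 2.0)        # standard open-loop step size
                x = (1 - gamma) * x + gamma * s
            return x

        # Toy usage: least squares min ||A x - b||^2 over the simplex.
        rng = np.random.default_rng(0)
        A, b = rng.standard_normal((30, 5)), rng.standard_normal(30)
        x = frank_wolfe_simplex(lambda x: 2 * A.T @ (A @ x - b), np.full(5, 0.2))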

    Evolving Takagi-Sugeno model based on online Gustafson-Kessel algorithm and kernel recursive least square method

    In this paper, we introduce an evolving system that uses sparse weighted kernel least squares as local models and an online Gustafson-Kessel clustering algorithm for structure identification. Our proposed online clustering algorithm forms elliptical clusters with arbitrary orientation, which leads to fewer but more complex-shaped clusters than spherical ones. Moreover, the clustering algorithm is able to determine the number of required clusters by adding new clusters over time and to reduce model redundancy by merging similar clusters. Additionally, we propose a weighted kernel recursive least squares method with a new sparsification procedure based on the instant prediction error, and we introduce an adaptive gradient-based rule for tuning the kernel size. The sparsification procedure and adaptive kernel size significantly improve the performance of kernel recursive least squares. To illustrate our methodology, we apply the introduced model to the online identification of a time-varying nonlinear system. Finally, to show the superiority of our approach over some known online approaches, two different time series are considered: Mackey-Glass as a benchmark and electrical load as a real-world time series.
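
    As a stripped-down illustration of the kind of local model involved, the sketch below grows a kernel dictionary only when the instant prediction error is large. It is a simplified stand-in: the batch refit replaces the paper's weighted recursive update, and the adaptive kernel-size rule is omitted.

        import numpy as np

        def rbf(x, y, sigma=1.0):
            return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

        class SparseKernelLS:
            """Online kernel least squares with error-driven sparsification:
            a sample joins the dictionary only if the instant prediction
            error exceeds `tol` (simplified stand-in for the paper's rule)."""
            def __init__(self, sigma=1.0, lam=1e-3, tol=0.1):
                self.sigma, self.lam, self.tol = sigma, lam, tol
                self.dict_x, self.dict_y = [], []

            def predict(self, x):
                if not self.dict_x:
                    return 0.0
                k = np.array([rbf(x, d, self.sigma) for d in self.dict_x])
                return float(k @ self.alpha)

            def update(self, x, y):
                err = y - self.predict(x)
                if abs(err) > self.tol:        # sparsification test
                    self.dict_x.append(x)
                    self.dict_y.append(y)
                    K = np.array([[rbf(a, b, self.sigma) for b in self.dict_x]
                                  for a in self.dict_x])
                    # ridge-regularized refit over the small dictionary
                    self.alpha = np.linalg.solve(
                        K + self.lam * np.eye(len(K)), np.array(self.dict_y))
                return err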

    Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution

    Semi-discrete optimal transport problems, which evaluate the Wasserstein distance between a discrete and a generic (possibly non-discrete) probability measure, are believed to be computationally hard. Although such problems are ubiquitous in statistics, machine learning and computer vision, this perception has not yet received a theoretical justification. To fill this gap, we prove that computing the Wasserstein distance between a discrete probability measure supported on two points and the Lebesgue measure on the standard hypercube is already #P-hard. This insight prompts us to seek approximate solutions for semi-discrete optimal transport problems. We thus perturb the underlying transportation cost with an additive disturbance governed by an ambiguous probability distribution, and we introduce a distributionally robust dual optimal transport problem whose objective function is smoothed with the most adverse disturbance distributions from within a given ambiguity set. We further show that smoothing the dual objective function is equivalent to regularizing the primal objective function, and we identify several ambiguity sets that give rise to several known and new regularization schemes. As a byproduct, we discover an intimate relation between semi-discrete optimal transport problems and discrete choice models traditionally studied in psychology and economics. To solve the regularized optimal transport problems efficiently, we use a stochastic gradient descent algorithm with imprecise stochastic gradient oracles. A new convergence analysis reveals that this algorithm improves the best known convergence guarantee for semi-discrete optimal transport problems with entropic regularizers.
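
    For the entropic special case, the smoothed dual can be maximized with exactly this kind of stochastic gradient scheme. The sketch below is our own illustration, with the uniform measure on the unit square standing in for the continuous marginal and a squared Euclidean cost; it is not the paper's algorithm with imprecise oracles.

        import numpy as np

        def semidiscrete_ot_sgd(ys, nu, eps=0.05, n_iter=20000, lr=0.5, seed=0):
            """Stochastic ascent on the entropically smoothed semi-discrete dual:
            maximize over psi
                sum_j psi_j nu_j + E_x[ -eps log sum_j exp((psi_j - c(x, y_j)) / eps) ]
            with x ~ Uniform([0, 1]^2)."""
            rng = np.random.default_rng(seed)
            psi = np.zeros(len(ys))
            for t in range(1, n_iter + 1):
                x = rng.random(2)                   # draw from the continuous marginal
                c = np.sum((ys - x) ** 2, axis=1)   # costs to the discrete atoms
                a = (psi - c) / eps
                p = np.exp(a - a.max())             # stabilized soft assignment of x
                p /= p.sum()
                psi += (lr / np.sqrt(t)) * (nu - p) # unbiased stochastic dual gradient
            return psi

        ys = np.array([[0.25, 0.25], [0.75, 0.75]]) # discrete measure on two points
        psi = semidiscrete_ot_sgd(ys, nu=np.array([0.5, 0.5]))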

    Regularization via Mass Transportation

    The goal of regression and classification methods in supervised learning is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce training data, overfitting is typically mitigated by adding regularization terms to the objective that penalize hypothesis complexity. In this paper we introduce new regularization techniques using ideas from distributionally robust optimization, and we give new probabilistic interpretations to existing techniques. Specifically, we propose to minimize the worst-case expected loss, where the worst case is taken over the ball of all (continuous or discrete) distributions that have a bounded transportation distance from the (discrete) empirical distribution. By choosing the radius of this ball judiciously, we can guarantee that the worst-case expected loss provides an upper confidence bound on the loss on test data, thus offering new generalization bounds. We prove that the resulting regularized learning problems are tractable and can be tractably kernelized for many popular loss functions. We validate our theoretical out-of-sample guarantees through simulated and empirical experiments.
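
    In its simplest form (a 1-Wasserstein ball that transports only the features, with a loss $L$ that is Lipschitz in its first argument; we state it schematically rather than in the paper's full generality), the worst-case expected loss reduces to empirical risk plus a norm penalty:

        \[
          \sup_{\mathbb{Q}: \, W(\mathbb{Q}, \widehat{\mathbb{P}}_N) \le \epsilon}
          \mathbb{E}^{\mathbb{Q}}\big[ L(\langle w, x \rangle, y) \big]
          = \frac{1}{N} \sum_{i=1}^{N} L(\langle w, x_i \rangle, y_i)
          + \epsilon \, \mathrm{Lip}(L) \, \|w\|_* ,
        \]

    so the Wasserstein radius $\epsilon$ acts as the regularization weight and the dual norm $\|w\|_*$ of the hypothesis as the penalty.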

    Methods for the Planning of Hierarchical High-Speed Networks (MAN)

    Metropolitan Area Networks (MANs) have been developed to follow up Local Area Networks (LANs) as the next generation of high-speed communication networks. MANs based on the IEEE 802.6 standard have already been introduced in the public sector, both as a network for the early deployment of broadband services and later as an access network for Broadband-ISDN. Due to the growing demand for broadband services, it is expected that MANs will need to be applied in wider geographical areas, i.e. up to several hundred kilometers, to supply a large number of customers with broadband facilities. Building up a broadband network infrastructure requires large investments. A network operator therefore depends on a systematic planning and design process in order to make efficient use of resources such as transmission lines and networking equipment.

    The introduction focuses on some of the major trends in data communications in public networks and gives an outline of the following chapters. Chapter 2 introduces the basic concepts of Local and Metropolitan Area Networks, such as switching techniques, media access control, routing and LAN-MAN interworking. The main focus is on the architecture of MAN subnetworks according to the DQDB standard. Chapter 3 is mainly dedicated to modeling aspects of communication networks and to optimization techniques suited for the topological design of MANs. Modeling aspects include traffic models as well as graphs. Moreover, this chapter provides some theoretical background for the MAN planning and design process. Typically, topological design problems can be formulated as combinatorial optimization problems, and planning models are a prerequisite for effectively solving the planning problem. Chapter 4 presents a planning model which can be formulated as a non-linear integer optimization problem; the objective is to design a cost-effective network structure fulfilling Quality of Service requirements.

    In order to solve the MAN design problem presented in chapter 4, heuristic algorithms have been developed. Chapter 5 provides a detailed description of three MAN design methods as well as a case study for their evaluation. The first algorithm is a decomposition approach consisting of two parts: a Clustering and Load sharing Algorithm (CLA) provides an initial assignment of MAN stations to subnetworks. The second algorithm is based on Simulated Annealing, which has proved to be an efficient method in many areas of combinatorial optimization. Genetic Algorithms form the basis of the third method; based on experience with simpler planning problems, an efficient coding scheme and appropriate genetic operators have been designed. Chapter 6 concentrates on the MAN planning tool which is based on the methods described in chapter 5, including a presentation of the functionality and the modular software structure of the tool, which allows for further extensions such as additional analysis methods. Chapter 7 summarizes the main results and gives an outlook on possible extensions of the planning methods to other network types.
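
    To make the Simulated Annealing component concrete, here is a bare-bones sketch of the kind of search loop involved. It is a generic illustration with a placeholder load-balancing objective, not the thesis's cost model or Quality of Service constraints.

        import math, random

        def anneal(assignment, cost, neighbor, t0=1.0, cooling=0.995, n_iter=5000):
            """Generic simulated annealing over station-to-subnetwork assignments.
            `cost` scores an assignment (e.g., line costs plus QoS penalties);
            `neighbor` proposes a small change (e.g., moving one station)."""
            current, best = assignment, assignment
            for i in range(n_iter):
                t = t0 * cooling ** i              # geometric cooling schedule
                cand = neighbor(current)
                delta = cost(cand) - cost(current)
                # accept improvements always, uphill moves with Boltzmann probability
                if delta < 0 or random.random() < math.exp(-delta / t):
                    current = cand
                if cost(current) < cost(best):
                    best = current
            return best

        # Toy usage: assign 10 stations to 3 subnetworks, balancing their loads.
        random.seed(1)
        load = [random.randint(1, 9) for _ in range(10)]
        def cost(a):
            per_net = [sum(l for l, s in zip(load, a) if s == k) for k in range(3)]
            return max(per_net) - min(per_net)     # placeholder balance objective
        def neighbor(a):
            b = list(a); b[random.randrange(10)] = random.randrange(3); return b
        best = anneal([random.randrange(3) for _ in range(10)], cost, neighbor)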